HDDS-9426. Calculate Exclusive size for deep cleaned snapshot's deleted directories. #5579

aswinshakil · 2023-11-10T01:48:45Z

What changes were proposed in this pull request?

In #5339, we expanded the snapshot's deletedDirTable into the fileTable to calculate the exclusive size of the files under deleted directories. The problem with that approach is that altering a snapshot's fileTable and dirTable changes the snapDiff output and yields undesirable results. For example, consider a deleted directory structure like this:

dir1 --- subdir1 --- file1 to file100
     |
     --- subdir2 --- file1 to file100

Instead of snapDiff reporting DELETED dir1, it would report:
DELETED subdir1
DELETED subdir2
DELETED subdir1/file1 to file100
DELETED subdir2/file1 to file100

With the new approach in this patch, we do an in-memory walk of each deleted directory, deep clean it, and calculate the exclusive size of its files along the way. For each file, we apply the same check as in #5301.
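The in-memory walk described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this patch: `DeepCleanSketch`, `Dir`, `FileEntry`, and the `isExclusive` predicate are hypothetical names, and the predicate stands in for the per-file check from #5301.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical in-memory model of a deleted directory tree (not Ozone's classes). */
final class DeepCleanSketch {

  /** A file with an object ID and a size in bytes (illustrative fields). */
  static final class FileEntry {
    final long objectId;
    final long size;

    FileEntry(long objectId, long size) {
      this.objectId = objectId;
      this.size = size;
    }
  }

  /** A directory holding its files and sub-directories in memory. */
  static final class Dir {
    final String name;
    final List<Dir> subDirs = new ArrayList<>();
    final List<FileEntry> files = new ArrayList<>();

    Dir(String name) {
      this.name = name;
    }
  }

  /**
   * Walks the deleted directory iteratively with an explicit stack and sums
   * the sizes of the files that the (assumed) predicate marks as exclusive.
   */
  static long exclusiveSize(Dir deletedRoot, Predicate<FileEntry> isExclusive) {
    long total = 0;
    Deque<Dir> stack = new ArrayDeque<>();
    stack.push(deletedRoot);
    while (!stack.isEmpty()) {
      Dir current = stack.pop();
      // Account for the files directly under the current directory.
      for (FileEntry file : current.files) {
        if (isExclusive.test(file)) {
          total += file.size;
        }
      }
      // Defer the sub-directories; they are visited in later iterations.
      for (Dir sub : current.subDirs) {
        stack.push(sub);
      }
    }
    return total;
  }
}
```

An explicit stack avoids deep recursion on large trees. The snapshot's fileTable and dirTable are never modified in this model, which is the property that keeps the snapDiff output unchanged.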

We need a deep clean because files under a directory can be trapped between snapshots. Consider this case:

Create dir1
Create snapshot1
Create 1000 files under dir1 --> These files are trapped and not deleted unless the snapshot itself is deleted.
Delete dir1
Create snapshot2

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-9426

How was this patch tested?

Added an integration test.

hemantk-12

Thanks for the patch @aswinshakil

hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto

...ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDirectoryService.java

hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/OMConfigKeys.java

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java

...anager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotPurgeRequest.java

...ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/KeyDeletingService.java

...nager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDirectoryCleaningService.java

hemantk-12

Overall changes look good to me. Left some minor and cosmetic comments.

We can merge this once you resolve them.

...nager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDirectoryCleaningService.java

...tion-test/src/test/java/org/apache/hadoop/ozone/om/TestSnapshotDirectoryCleaningService.java

hemantk-12 · 2023-12-11T06:55:40Z

...tion-test/src/test/java/org/apache/hadoop/ozone/om/TestSnapshotDirectoryCleaningService.java

+    try {
+      count = cluster.getOzoneManager().getMetadataManager()
+          .countRowsInTable(table);
+      LOG.info("{} actual row count={}, expectedCount={}", table.getName(),


This log statement is fine for debugging, but it doesn't need to be added.

...nager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDirectoryCleaningService.java

hemantk-12

Thanks for updating it @aswinshakil

LGTM except for the seek and next, as noted in this comment: https://github.com/apache/ozone/pull/5579/files#r1413457354

...nager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDirectoryCleaningService.java

hemantk-12

LGTM.

swamirishi

@aswinshakil Thanks for the patch. I have a few major review comments from what I have gone through so far. I haven't reviewed the entire code and logic yet, but you can address these comments while I review the rest of the PR.

hadoop-hdds/common/src/main/resources/ozone-default.xml

...nager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDirectoryCleaningService.java

swamirishi · 2023-12-14T11:05:50Z

...nager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDirectoryCleaningService.java

+          }
+        }
+
+        if (directoryIterator.hasNext()) {


Can we move this loop into the else branch? In one iteration we should either delete the files in the directory or process a sub-directory, not both. Then we wouldn't need the next.next in the else loop condition, which complicates the logic.

We need the current directory reference to add the next sub-dir to the stack, so it is simpler to do it in the same iteration than in the next one.

...nager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDirectoryCleaningService.java

swamirishi · 2023-12-14T12:54:51Z

...one-manager/src/main/java/org/apache/hadoop/ozone/om/service/AbstractKeyDeletingService.java

+      // If the previous to previous snapshot doesn't
+      // have the key, then it is exclusive size for the
+      // previous snapshot.
+      if (keyInfoPrevToPrevSnapshot == null) {


We are supposed to look at the version diff when object versioning is enabled. The exclusive size would be the size of the versions that are found in the previous snapshot but not in the previous-to-previous snapshot.

Updated it.

Where is the version size being checked?

One version of the object corresponds to only one snapshot, so we need to check which version this is. It could be that the key was modified and we are deleting previous versions, or that we are keeping all object versions.
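For reference, the non-versioned rule stated in the code comment above (a key that the previous-to-previous snapshot does not have counts toward the previous snapshot's exclusive size) can be sketched as below. This is a simplified illustration, not the patch itself: `ExclusiveSizeSketch` and its method are hypothetical, snapshots are modeled as plain maps from key name to size, and the object-version diff discussed in this thread is deliberately not modeled.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; snapshots are modeled as key-name -> size maps. */
final class ExclusiveSizeSketch {

  /**
   * Sums the sizes of deleted keys that the previous snapshot holds but the
   * previous-to-previous snapshot does not. Object versions are NOT handled.
   */
  static long exclusiveSizeOfPrevious(List<String> deletedKeys,
                                      Map<String, Long> previousSnapshot,
                                      Map<String, Long> prevToPrevSnapshot) {
    long exclusive = 0;
    for (String key : deletedKeys) {
      Long sizeInPrevious = previousSnapshot.get(key);
      if (sizeInPrevious == null) {
        // The previous snapshot does not reference this key at all.
        continue;
      }
      if (!prevToPrevSnapshot.containsKey(key)) {
        // Only the previous snapshot holds the key, so its size is exclusive.
        exclusive += sizeInPrevious;
      }
    }
    return exclusive;
  }
}
```

With versioning enabled, a per-key presence check like this would presumably have to become a per-version comparison, which is the gap raised in the comments above.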

swamirishi

Added some suggestions. You may take them if they are helpful.

...nager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDirectoryCleaningService.java

swamirishi · 2023-12-19T08:59:08Z

...nager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDirectoryCleaningService.java

+                  stackTop.getDirValue().getObjectID(), "");
+          stackTop.setSubDirSeek(seekDirInDB);
+        } else {
+          // Adding \0 to seek the next greater element.


@hemantk-12 Can you confirm whether adding "\0" is a correct approach? @aswinshakil it would be great to add this edge case to the unit test if it is not there already. From what I know, this should work since UTF-8 encoding orders strings lexicographically, even with \0.

It looks good to me. As you said, we have to test it thoroughly.
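To illustrate the "\0" trick discussed above: under unsigned byte-wise ordering of UTF-8 keys, `key + "\0"` is the smallest string strictly greater than `key`, so seeking to it lands on the next entry. The sketch below models the table with a sorted set and a byte-wise comparator; `SeekSketch`, `UTF8_ORDER`, and `seekAfter` are hypothetical names, and the assumption that the underlying store compares keys byte-wise is mine, not something confirmed in this thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.NavigableSet;

/** Hypothetical sketch of seeking past a key by appending "\0". */
final class SeekSketch {

  /** Orders strings by their UTF-8 bytes, compared as unsigned values. */
  static final Comparator<String> UTF8_ORDER = (a, b) -> {
    byte[] x = a.getBytes(StandardCharsets.UTF_8);
    byte[] y = b.getBytes(StandardCharsets.UTF_8);
    int n = Math.min(x.length, y.length);
    for (int i = 0; i < n; i++) {
      int c = Integer.compare(x[i] & 0xff, y[i] & 0xff);
      if (c != 0) {
        return c;
      }
    }
    // A strict prefix sorts before any longer string that extends it.
    return Integer.compare(x.length, y.length);
  };

  /** Returns the first key strictly greater than lastKey, or null if none. */
  static String seekAfter(NavigableSet<String> keys, String lastKey) {
    // lastKey + "\0" is the immediate successor of lastKey in byte order.
    return keys.ceiling(lastKey + "\0");
  }
}
```

A unit test for the edge case could cover a key that is a prefix of the next key (for example "/10/dirA" followed by "/10/dirA1") and the last key in the table, where the seek should find nothing.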

swamirishi

@aswinshakil Thanks for the patch. I have raised a follow-up Jira to fix the exclusive size calculation for buckets with object versioning enabled: https://issues.apache.org/jira/browse/HDDS-10051

aswinshakil · 2024-01-03T16:43:26Z

Thank you @hemantk-12 and @swamirishi for the reviews!

…'s deleted directories. (#5579)" (#6051) Reason for revert: incompatible proto changes This reverts commit 0528494. This reverts commit d969689.

HDDS-9426. Calculate Exclusive size for deep cleaned snapshot's deleted directories.

22ef315

aswinshakil added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label Nov 10, 2023

aswinshakil requested review from hemantk-12 and swamirishi November 10, 2023 01:48

aswinshakil self-assigned this Nov 10, 2023

aswinshakil added 2 commits November 13, 2023 11:01

HDDS-9426. Fix findbugs.

c8bc2d5

HDDS-9426. Fix UT.

6cea41c

hemantk-12 reviewed Nov 27, 2023

aswinshakil added 3 commits November 28, 2023 12:50

HDDS-9426. Address Review Comments.

983a9ad

Merge branch 'master' of https://github.com/apache/ozone into HDDS-9426-dirsize

e525cf7

HDDS-9426. Fix UT.

671df96

hemantk-12 reviewed Nov 30, 2023

HDDS-9426. Address Review Comments.

3894fb0

hemantk-12 reviewed Dec 4, 2023

HDDS-9426. Move variable declaration.

bc6685b

hemantk-12 reviewed Dec 11, 2023

HDDS-9426. Address Review Comments.

f6f50d9

hemantk-12 reviewed Dec 12, 2023

hemantk-12 approved these changes Dec 13, 2023

swamirishi requested changes Dec 14, 2023

aswinshakil added 2 commits December 14, 2023 16:01

Address Review Comments.

85c4886

Merge branch 'master' of https://github.com/apache/ozone into HDDS-9426-dirsize

7a6a76e

swamirishi reviewed Dec 14, 2023

Address Review Comments.

ef46221

swamirishi reviewed Dec 19, 2023

swamirishi approved these changes Jan 2, 2024

aswinshakil merged commit d969689 into apache:master Jan 3, 2024

adoroszlai added a commit to adoroszlai/ozone that referenced this pull request Jan 22, 2024

Revert "HDDS-9426. Calculate Exclusive size for deep cleaned snapshot's deleted directories. (apache#5579)" This reverts commit d969689.

49abf4f

adoroszlai mentioned this pull request Jan 22, 2024

HDDS-10180. update proto.lock for Ozone 1.4.0 #6044

Closed

adoroszlai mentioned this pull request Jan 22, 2024

Revert "HDDS-9426. Calculate Exclusive size for deep cleaned snapshot's deleted directories. (#5579)" #6051

Merged

aswinshakil mentioned this pull request Jan 26, 2024

HDDS-9426. Calculate Exclusive size for deep cleaned snapshot's deleted directories #6099

Merged

swamirishi pushed a commit to swamirishi/ozone that referenced this pull request Dec 3, 2025

CDPD-62286. HDDS-9426. Calculate Exclusive size for deep cleaned snapshot's deleted directories. (apache#5579) Change-Id: Ibea7794613f1702fb478fd70eeef7555255d697a

1ec6d41
